26 research outputs found

    Structural Basis of Gate-DNA Breakage and Resealing by Type II Topoisomerases

    Type II DNA topoisomerases are ubiquitous enzymes with essential functions in DNA replication, recombination and transcription. They change DNA topology by forming a transient covalent cleavage complex with a gate-DNA duplex that allows transport of a second duplex through the gate. Despite their biological importance and targeting by anticancer and antibacterial drugs, cleavage complex formation and reversal are not understood for any type II enzyme. To address the mechanism, we have used X-ray crystallography to study sequential states in the formation and reversal of a DNA cleavage complex by topoisomerase IV from Streptococcus pneumoniae, the bacterial type II enzyme involved in chromosome segregation. A high-resolution structure of the complex captured by a novel antibacterial dione reveals two drug molecules intercalated at a cleaved B-form DNA gate and anchored by drug-specific protein contacts. Dione release generated drug-free cleaved and resealed DNA complexes in which the DNA gate instead adopts an unusual A/B-form helical conformation with a Mg2+ ion repositioned to coordinate each scissile phosphodiester group and promote reversible cleavage by active-site tyrosines. These structures, the first for putative reaction intermediates of a type II topoisomerase, suggest how a type II enzyme reseals DNA during its normal reaction cycle and illuminate aspects of drug arrest important for the development of new topoisomerase-targeting therapeutics.

    Exploiting and assessing multi-source data for supervised biomedical named entity recognition

    Motivation: Recognition of biomedical entities from scientific text is a critical component of natural language processing and automated information extraction platforms. Modern named entity recognition approaches rely heavily on supervised machine learning techniques, which are critically dependent on annotated training corpora. These approaches have been shown to perform well when trained and tested on the same source. However, in such a scenario, the performance and evaluation of these models may be optimistic, as such models may not necessarily generalize to independent corpora, resulting in potential non-optimal entity recognition for large-scale tagging of widely diverse articles in databases such as PubMed. Results: Here we aggregated published corpora for the recognition of biomolecular entities (such as genes, RNA, proteins, variants, drugs, and metabolites), identified entity class overlap and performed a leave-corpus-out cross-validation strategy to test the efficiency of existing models. We demonstrate that accuracies of models trained on individual corpora decrease substantially for recognition of the same biomolecular entity classes in independent corpora. This behavior is possibly due to limited generalizability of entity-class-related features captured by individual corpora (model “overtraining”), which we investigated further at the orthographic level, as well as potential annotation standard differences. We show that the combined use of multi-source training corpora results in overall more generalizable models for named entity recognition, while achieving comparable individual performance. By performing learning-curve-based power analysis we further identified that performance is often not limited by the quantity of the annotated data.
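
The leave-corpus-out idea mentioned in the abstract above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the corpus names, the toy token/label pairs, and the majority-label "tagger" are hypothetical stand-ins and are not the models or data used in the paper. The sketch only shows the evaluation pattern, in which each corpus is held out in turn and the model is trained on all the others.

```python
from collections import Counter


def leave_corpus_out_splits(corpora):
    """Yield (held_out_name, train_examples, test_examples) for each corpus."""
    for held_out in corpora:
        # Training data is every example from every corpus except the held-out one.
        train = [ex for name, exs in corpora.items() if name != held_out for ex in exs]
        test = list(corpora[held_out])
        yield held_out, train, test


def train_majority_tagger(examples):
    """Toy stand-in for a model: map each token to its most frequent training label."""
    counts = {}
    for token, label in examples:
        counts.setdefault(token, Counter())[label] += 1
    return {tok: c.most_common(1)[0][0] for tok, c in counts.items()}


def accuracy(model, examples, default="O"):
    """Fraction of tokens tagged correctly; unseen tokens get the default label."""
    correct = sum(1 for tok, lab in examples if model.get(tok, default) == lab)
    return correct / len(examples)


# Hypothetical toy corpora of (token, label) pairs, for illustration only.
corpora = {
    "corpus_a": [("BRCA1", "GENE"), ("binds", "O"), ("aspirin", "DRUG")],
    "corpus_b": [("TP53", "GENE"), ("aspirin", "DRUG"), ("and", "O")],
    "corpus_c": [("BRCA1", "GENE"), ("EGFR", "GENE"), ("and", "O")],
}

results = {}
for name, train, test in leave_corpus_out_splits(corpora):
    model = train_majority_tagger(train)
    results[name] = accuracy(model, test)

for name, acc in results.items():
    print(f"held out {name}: accuracy = {acc:.2f}")
```

In this toy setup, a held-out corpus scores lower when it contains entities that never appear in the training corpora, which is the kind of cross-corpus generalization gap the abstract describes.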

    Genomic-driven nutritional interventions for radiotherapy-resistant rectal cancer patient

    Radiotherapy response of rectal cancer patients is dependent on a myriad of molecular mechanisms including response to stress, cell death, and cell metabolism. Modulation of lipid metabolism emerges as a unique strategy to improve radiotherapy outcomes due to its accessibility by bioactive molecules within foods. Even though a few radioresponse modulators have been identified using experimental techniques, trying to experimentally identify all potential modulators is intractable. Here we introduce a machine learning (ML) approach to interrogate the space of bioactive molecules within food for potential modulators of radiotherapy response and provide phytochemically-enriched recipes that encapsulate the benefits of discovered radiotherapy modulators. Potential radioresponse modulators were identified using a genomic-driven network ML approach, metric learning and domain knowledge. Then, recipes from the Recipe1M database were optimized to provide ingredient substitutions maximizing the number of predicted modulators whilst preserving the recipe’s culinary attributes. This work provides a pipeline for the design of genomic-driven nutritional interventions to improve outcomes of rectal cancer patients undergoing radiotherapy.